TMA: Tera‐MACs/W neural hardware inference accelerator with a multiplier‐less massive parallel processor
Abstract
Computationally intensive inference tasks of deep neural networks have brought about a revolution in accelerator architecture, aiming to reduce power consumption as well as latency. The key figure of merit for hardware accelerators is the number of multiply-and-accumulate operations per watt (MACs/W); state-of-the-art accelerators have, so far, reached several hundred Giga-MACs/W. We propose a Tera-MACs/W (TMA) accelerator with 8-bit activations and scalable integer weights of less than 1 byte. The architecture's main feature is a configurable processing element for matrix-vector operations. The proposed design uses a multiplier-less massively parallel processor, which makes it attractive for energy-efficient, high-performance neural network applications. We benchmark our system's latency, power, and performance using AlexNet trained on ImageNet. Finally, we compare our accelerator's throughput with prior works. TMA outperforms its state-of-the-art counterparts in terms of area efficiency, achieving 2.3 TMACs/W at 0.9 V on a 28-nm Virtex-7 FPGA chip.
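To make the multiplier-less idea concrete, the following is a minimal C sketch, not the paper's hardware design, of a MAC that multiplies an 8-bit activation by a sub-byte signed weight using only shifts and adds; the function name mac_shift_add, the 4-bit weight width, and the example values are illustrative assumptions.

/* Hypothetical sketch (not the TMA RTL): a multiplier-less MAC that treats
 * each sub-byte signed weight as a sum of powers of two, so the activation
 * is shifted and added instead of multiplied. */
#include <stdint.h>
#include <stdio.h>

#define WEIGHT_BITS 4   /* assumed magnitude width of the sub-byte weights */

/* Accumulate act * w without a multiplier: walk the magnitude bits of w,
 * add the correspondingly shifted activation, then restore the sign. */
static int32_t mac_shift_add(int32_t acc, int8_t act, int8_t w)
{
    int32_t a_mag = (act < 0) ? -act : act;
    int32_t w_mag = (w   < 0) ? -w   : w;
    int32_t partial = 0;

    for (int b = 0; b < WEIGHT_BITS; ++b)
        if (w_mag & (1 << b))
            partial += a_mag << b;          /* shift-and-add partial product */

    if ((act < 0) != (w < 0))               /* product is negative when signs differ */
        partial = -partial;

    return acc + partial;
}

int main(void)
{
    /* Tiny matrix-vector slice: one output neuron with four inputs. */
    int8_t act[4] = { 12, -7, 33, 5 };      /* 8-bit activations */
    int8_t w[4]   = {  3, -2,  1, 6 };      /* sub-byte signed weights */
    int32_t acc = 0;

    for (int i = 0; i < 4; ++i)
        acc = mac_shift_add(acc, act[i], w[i]);

    printf("dot product = %d\n", acc);      /* 36 + 14 + 33 + 30 = 113 */
    return 0;
}

The point of the exercise is that each processing element needs only adders and shifters, which is the property the abstract credits for the energy efficiency of the massively parallel array.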
Similar resources
Towards a Low Power Hardware Accelerator for Deep Neural Networks
In this project, we take a first step towards building a low power hardware accelerator for deep learning. We focus on RBM-based pretraining of deep neural networks and show that there is significant robustness to random errors in the pre-training, training and testing phase of using such neural networks. We propose to leverage such robustness to build accelerators using low power but possibly un...
Nn-X - a hardware accelerator for convolutional neural networks
Gokhale, Vinayak A. M.S.E.C.E., Purdue University, August 2014. nn-X: A Hardware Accelerator for Convolutional Neural Networks. Major Professor: Eugenio Culurciello. Convolutional neural networks (ConvNets) are hierarchical models of the mammalian visual cortex. These models have been increasingly used in computer vision to perform object recognition and full scene understanding. ConvNets consist...
A Hardware Implementation of a Binary Neural Image Processor
This paper presents the work that has resulted in the SAT processor, a dedicated hardware implementation of a binary neural image processor. The SAT processor is aimed specifically at supporting the ADAM algorithm and is currently being integrated into a new version of the C-NNAP parallel image processor. The SAT processor performs binary matrix multiplications, a task that is computationally c...
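As a rough illustration of the kind of operation mentioned above, and not the SAT processor's actual design, the C sketch below computes a binary matrix-vector product with each weight row packed into a 32-bit word, so a bitwise AND followed by a population count stands in for multiply-accumulate; the helper popcount32 and the example data are assumptions.

/* Hypothetical sketch: binary matrix-vector product over packed bit rows. */
#include <stdint.h>
#include <stdio.h>

/* Portable population count; dedicated hardware (or a compiler builtin)
 * would do this in a single step. */
static int popcount32(uint32_t x)
{
    int n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

int main(void)
{
    /* 4 x 32 binary weight matrix, one packed row per output. */
    uint32_t rows[4] = { 0xF0F0F0F0u, 0x0F0F0F0Fu, 0xFFFF0000u, 0x00000FFFu };
    uint32_t input   = 0x12345678u;              /* 32 binary inputs in one word */

    for (int i = 0; i < 4; ++i) {
        int sum = popcount32(rows[i] & input);   /* count coincident 1-bits */
        printf("output[%d] = %d\n", i, sum);     /* thresholding this gives a binary output */
    }
    return 0;
}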
Artificial Neural Networks Processor - A Hardware Implementation Using a FPGA
Several implementations of Artificial Neural Networks have been reported in scientific papers. Nevertheless, these implementations do not allow the direct use of off-line trained networks, either because of their much lower precision compared with the software solutions in which the networks are prepared, or because of modifications to the activation function. In the present work a hardware solution called Artificial Neur...
New Hardware for Massive Neural Networks
Transient phenomena associated with forward-biased silicon p+-n-n+ structures at 4.2 K show remarkable similarities with biological neurons. The devices play a role similar to the two-terminal switching elements in Hodgkin-Huxley equivalent circuit diagrams. The devices provide simpler and more realistic neuron emulation than transistors or op-amps. They have such low power and current require...
Journal
Title: International Journal of Circuit Theory and Applications
Year: 2021
ISSN: 0098-9886, 1097-007X
DOI: https://doi.org/10.1002/cta.2917